A mirror of all 126,900 sounds on Freesound shorter than 4 seconds, as of April 4, 2017. Metadata for all sounds is stored in the json.zip files, and the high-quality MP3s are stored in the mp3.zip files.
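As a minimal sketch of reading the metadata, the snippet below walks one archive and yields each JSON record. The helper name iter_metadata and the archive path are hypothetical, and it assumes each archive member ending in .json holds one JSON document; the actual layout inside the json.zip files may differ.

```python
import json
import zipfile


def iter_metadata(zip_path):
    """Yield (member_name, record) for every .json member of a zip archive.

    Assumption: each .json member contains exactly one JSON document.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip directories and any non-JSON members.
            if name.lower().endswith(".json"):
                with archive.open(name) as handle:
                    yield name, json.load(handle)
```

Because the function is a generator, records are parsed one at a time instead of loading the whole archive into memory.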
A processed subset of the samples is available as a numpy matrix of 250ms fingerprints, in fingerprints.npy. The processing to get from the raw audio to the numpy matrix is shown in the AudioNotebooks repo, specifically in Collect Samples.ipynb and Samples to Fingerprints.ipynb. The associated filenames and metadata for the processed samples are available in fingerprints.filenames.txt.zip and fingerprints.filenames.txt.zip. During processing, 2,400 sounds are lost due to loading errors, processing errors, or silence.
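A minimal sketch of loading the fingerprint matrix alongside its filenames follows. The helper name load_fingerprints is hypothetical, and the code assumes the zip holds a single UTF-8 text file with one filename per line, in the same order as the rows of the matrix; neither the matrix shape nor the text layout is documented here, so check both before relying on this.

```python
import zipfile

import numpy as np


def load_fingerprints(npy_path, filenames_zip_path):
    """Return (fingerprints, filenames) with one filename per fingerprint row."""
    # mmap_mode="r" maps the file read-only instead of loading it all at once.
    fingerprints = np.load(npy_path, mmap_mode="r")

    with zipfile.ZipFile(filenames_zip_path) as archive:
        # Assumption: the first (and only) member is the filename list.
        member = archive.namelist()[0]
        filenames = archive.read(member).decode("utf-8").splitlines()

    # Guard against a mismatch between the matrix and the filename list.
    if len(filenames) != fingerprints.shape[0]:
        raise ValueError(
            f"{len(filenames)} filenames for {fingerprints.shape[0]} fingerprints"
        )
    return fingerprints, filenames
```

The length check is there because any later lookup by row index silently returns the wrong sound if the two files are out of step.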
Corresponding labels for the fingerprints are available in labels_mapping.zip as two .pkl files: one maps from fingerprint index to synset indices, and the other from synset index to fingerprint indices. The labels were built with Metadata to Labels.ipynb, which uses the synsets in synset.json to tag samples whose description or tags include the provided tokens. Synsets were selected based on frequently occurring tokens.
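To illustrate how the two mappings relate, here is a rough sketch of token-based tagging. It is not the code from Metadata to Labels.ipynb: the function name build_label_mappings, the record fields "description" and "tags", and the representation of synsets as a list of token lists are all assumptions made for this example, and matching uses plain lowercase whitespace splitting.

```python
from collections import defaultdict


def build_label_mappings(records, synsets):
    """Tag each record with every synset that shares a token with it.

    records: list of dicts with 'description' (str) and 'tags' (list of str).
    synsets: list of token lists; a synset's index is its position in the list.
    Returns (fingerprint_to_synsets, synset_to_fingerprints).
    """
    token_sets = [{token.lower() for token in tokens} for tokens in synsets]
    fingerprint_to_synsets = {}
    synset_to_fingerprints = defaultdict(list)

    for index, record in enumerate(records):
        # Collect the words of the description plus the tags, lowercased.
        words = set(record.get("description", "").lower().split())
        words.update(tag.lower() for tag in record.get("tags", []))

        matches = [s for s, tokens in enumerate(token_sets) if words & tokens]
        fingerprint_to_synsets[index] = matches
        for s in matches:
            synset_to_fingerprints[s].append(index)

    return fingerprint_to_synsets, dict(synset_to_fingerprints)
```

The second mapping is simply the inverse of the first, which is why the two can be built in a single pass. Punctuation attached to a word (for example "drum.") would not match under this simple tokenization.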
Each audio file has its own license, specified in the corresponding JSON file. Approximately 15% have a non-commercial license, either CC-NC (11%) or Sampling+ (4%), and the rest are either CC-Attribution (42%) or CC-0 (40%).
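A sketch of how the licenses could be grouped when filtering the dataset is shown below. It assumes each metadata record has a "license" field holding a Creative Commons URL, as the Freesound API returns; the function names and the substring rules are illustrative, and Sampling+ is grouped with the non-commercial licenses to follow the breakdown above.

```python
from collections import Counter

# Grouping used in the breakdown above (treated here as an assumption).
NON_COMMERCIAL = {"CC-NC", "Sampling+"}


def license_category(license_url):
    """Map a Creative Commons license URL to a short category name."""
    url = license_url.lower()
    if "publicdomain/zero" in url:
        return "CC-0"
    if "sampling+" in url:
        return "Sampling+"
    if "/by-nc" in url:
        return "CC-NC"
    if "/by/" in url:
        return "CC-Attribution"
    return "other"


def license_shares(records):
    """Return the fraction of records in each license category."""
    counts = Counter(license_category(r.get("license", "")) for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: count / total for category, count in counts.items()}


def allows_commercial_use(record):
    """True only for categories known to permit commercial use."""
    return license_category(record.get("license", "")) in {"CC-0", "CC-Attribution"}
```

Records whose license URL is missing or unrecognized fall into "other" and are excluded by allows_commercial_use, which is the conservative choice.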
Note that a disproportionately large number of these sounds (more than 5%) are from the 7k-sample Mridangam Stroke Dataset.
For attribution in academic contexts, please cite this dataset as:
McDonald, "Freesound 4 Seconds", Internet Archive, 2017. https://archive.org/details/freesound4s